Explainable death toll motion modeling: COVID-19 data-driven narratives

Models have gained the spotlight in many discussions surrounding COVID-19. The urgency for timely decisions resulted in a multitude of models as informed policy actions must be made even when so many uncertainties about the pandemic still remain. In this paper, we use machine learning algorithms to build intuitive country-level COVID-19 motion models described by death toll velocity and acceleration. Model explainability techniques provide insightful data-driven narratives about COVID-19 death toll motion models—while velocity is explained by factors that are increasing/reducing death toll pace now, acceleration anticipates the effects of public health measures on slowing the death toll pace. This allows policymakers and epidemiologists to understand factors driving the outbreak and to evaluate the impacts of different public health measures.


Introduction
What are the key factors accelerating the number of deaths due to COVID-19? What is the impact of public health countermeasures such as travel bans and movement restriction? Does mass social isolation effectively help decreasing the death toll rate? What would be the impact on death toll if public health countermeasures are relaxed? Answers to these questions may lead to timely and informed policy decisions, but most of them are still open to debate.
COVID-19 seems to be harder to understand than previously appreciated, and a general law or the complete causal chain behind death toll evolution seem to be unfeasible to obtain in the short term. By contrast, there are multiple competing explanations being reported that attribute different causes to death toll evolution in different countries. While these partial explanations are often guided by our standard understanding of the spread of infectious diseases, the COVID-19 death toll increases and decreases in myriad different ways depending on characteristics and particularities of each country, and there is not a sole cause or even one major explanatory factor.
Here we propose to model the COVID-19 death toll evolution by using the concept of motion in which velocity and acceleration are defined as follows: ).
• Acceleration is equal to the slope of the velocity as a function of time. Thus, acceleration is the second order derivative of the number of deaths ( deaths day 2 . ).
Figs 1-3 show death toll, velocity and acceleration for eight countries: Belgium, Brazil, France, Germany, Italy, Sweden, UK and USA, as of June 5th 2020. While our analysis encompasses a much larger set of countries, we show only eight curves in order to avoid clutter. For each country, our analysis starts on the day of the first death, which is hereafter referred to as the first outbreak day. Commonly, velocity increases at an alarming exponential rate, but then it follows an approximate bell shaped curve [1]. The steepness of the curve is determined by how quickly the death toll increases. The peak of velocity coincides with the moment in which acceleration is exactly zero.
In the figure, curves are aligned by the number of days after the first death reported in the corresponding country. Clearly, COVID-19 death toll evolves differently in each country, with some showing flattened curves but others not. The factors driving the pandemic evolution in each country may vary greatly depending on their specificities and on the countermeasures adopted at each one. By modeling velocity and acceleration across different economy, demography, social, and climate contexts we may unveil non-obvious patterns driving the COVID-19 death toll evolution. Countermeasures (or the lack of them) are often among the factors impacting acceleration the most. Velocity, on the other hand, is mostly impacted by socioeconomic and demographic factors.
In summary, the main contributions of this paper are: • We model the COVID-19 death toll evolution using the concepts of velocity and acceleration. We build models that predict velocity ( deaths day . ) and acceleration (Δ velocity) to be observed at an arbitrary day of the outbreak. Model predictions come with their explanations, thus providing insightful narratives on why the models behave the way they do. This is an enabling tool for policymakers to understand high-level reasoning behind death toll evolution, so that our models are specially valuable as a planning tool for government officials who need to know about the factors driving the likely trajectory of COVID-19 deaths in their countries. Explanations for acceleration may point to course correction measures that need immediate attention in order to avoid disastrous situations, while explanations for velocity usually point to more intricate factors such as urbanization problems, typical comorbidities, and social inequality issues.
• Our models combine over 200 different factors in order to provide powerful predictions and insightful explanations for death toll evolution. The multitude of factors reflects the interdisciplinary aspects of the pandemic evolution. Clearly, COVID-19 death toll evolution is a complex phenomenon in which factors related to diverse disciplines interact in nonobvious ways, finally resulting in the observed death toll.

Materials and methods
In this section we present the data we use in this study and then we discuss how our models were built.

Data
We employed different sources of data to build our models: (i) the daily COVID-19 death toll curve in each country, (ii) country's countermeasures in response to COVID-19 pandemics, (iii) community mobility reports, (iv) estimations of critical care beds available for and needed by COVID-19 patients in each country, and (v) country's development indicators. Next, we present details about the data. COVID-19 death toll curve. Data concerning the evolution on the number of deaths comes from case reports provided by the European Center for Disease Prevention and Control (ECDC). The data was frozen on June 5th 2020, comprising 8,926,399 cases and 468,257 deaths reported worldwide. Death reports from 211 countries are presented daily. However, these reports have some issues, including discrepancies in reporting practices across countries. Another issue is the delayed communication of the death, which may result in abrupt variations in consecutive days of the death toll curve of a country. That is, days with under-reported number of deaths are followed by days with over-reported number of deaths. To avoid overfitting the original death toll curve with unnecessarily complex models, we smoothed it using the 7-day rolling average, and from the resulting curve we finally get daily values of velocity ( deaths per 100k = day ) and acceleration ( deaths per 100k = day 2 ). We define velocity and acceleration on Eqs 1 and 2, respectively.
where t is the number of days after the first death. Country's countermeasures in response to COVID-19 pandemics. Measures are implemented to slow the spread of the virus by enforcing physical distance between people. The dates in which these countermeasures were taken in response to COVID-19 pandemics come from the Oxford COVID-19 Government Response Tracker. Data is collected from public sources by a team of Oxford University students and staff from every part of the world [2]. Table 1 shows data about the different measures adopted in response to the COVID-19 pandemics. The value that a countermeasure receives on a specific day is given as the number of days during the outbreak for which it has being taken (or not taken).
To compare the set of countermeasures adopted by each country, we apply the t-Distributed Stochastic Neighbor Embedding (t-SNE) method [3], which embeds the 47 measures present in Table 1 into a 2D space. t-SNE works by minimizing the divergence between two distributions: one that measures pairwise similarities of the input objects and another that measures pairwise similarities of the corresponding low-dimensional points in the embedding. As a result, both X and Y axes are non-linear combinations of the original 47 dimensions. Values correspond to the number of days that the corresponding measure has being taken. For each countermeasure it is shown, at the peak of velocity on the country, the median followed by first and third quartiles (calculated over the 211 countries considered in the study). In the figure, trajectories are shown over a heatmap composed of clear and dark regions. When the trajectory is passing over a dark region, the corresponding measures are less stringent. By contrast, a trajectory encompasses stringent measures when it passes over a lighter region. Further, the thinner part of a trajectory indicates an abrupt variation on the direction due to changes on the adopted measures. Thicker parts, on the other hand, indicate consecutive days adopting the same set of measures. Finally, countries placed next to each other are currently adopting a similar set of measures. The figure shows a large variation in government responses, with some countries deciding to rise the stringency of response, while others keeping less stringent measures. We follow the definition of stringency presented in [2], that is, the overall stringency level of a country on a given day is calculated by taking the average of the stringency levels of the following nine measures: school closing, workplace closing, public  events cancelling, gathering size restrictions, public transportation closing, stay at home requirements, restrictions on internal movement, restrictions on international travel, public information campaign. The stringency of each individual measure on a day takes a value between 0 and 100. For instance, in Table 1, measures of different types are sorted according to their stringency level (i.e., "No measures" has stringency 0, while the last measure has stringency level of 100). The index does not provide information on how well policies are enforced, nor does it capture demographic or cultural characteristics that may affect the spread of COVID-19. Finally, the overall stringency is given as:

Schools and universities
where S i is the stringency associated with measure i on a given day. Notice that the stringency of a measure is discretized, for instance, in case of a four-level measure, its stringency level may assume only the values 0, 100 3 = , 200 3 = , and 100. Finally, as the nine measures that compose the stringency computation are within the 47 dimensions previously discussed in Table 1, the stringency level is easily transferred as colors in Fig 4. While we only show trajectories for few countries, it is worth mentioning that the colormap was built using the entire set of trajectories for all 211 countries considered in our analysis. Therefore, points are spread all over the space, and these points were interpolated in order to produce a smoother colormap. Fig 4 shows trajectories for several countries. The start point of a trajectory corresponds to the first outbreak day of the country, and thus it varies for each country. The end point of a trajectory is June 5th 2020, thus being the same for all countries. Fig 5 shows plots for specific countries, namely: Canada, Ecuador, Sweden and Switzerland, which have the same heatmap as background. We can see that Canada has a trajectory passing over a dark region and a long thicker part, meaning less stringent measures and consecutive days adopting the same set of measures, such as recommend closing schools, universities, workplaces, cancelling events, limit private gatherings, closing public transport, require not leaving home and not to travel. On the other hand, Ecuador and Switzerland not only have an abrupt change of direction due to changes on the adopted measures but also their trajectories passes over a lighter region and thus encompass stringent measures. Sweden presents a more straight line meaning consecutive days adopting the same set of measures, such as lack of countermeasures which led to to be an outlier with cases and deaths increasing more rapidly than in its Nordic neighbours, explained by their COVID-19 strategy and the governance of the health system that has enabled the strategy to continue without major course corrections.
COVID-19 community mobility reports. Human mobility is associated with growth and decline of new cases of COVID-19 [4]. Fig 6 shows how mobility changes as a function of the level of stringency being adopted. As expected, human mobility reduces as stringency increases. The data used to draw the curves in Fig 6 comprises the number of visitors to specific categories of location (e.g., grocery stores, parks, subway and train stations) every day and compares this change relative to a baseline day before the pandemic outbreak. Google Maps [5] provides anonymized data showing how peoples' movements have changed throughout the pandemic. Baseline days represent a normal value for that day of the week, given as median value over the five-week period from January 3rd to February 6th 2020, extracted from Google Maps data. The data also comprises the change in average duration spent in residences during the outbreak. We use an offset of three weeks in order to better correlate with the number of deaths by taking into account an approximation of incubation and treatment period lengths.
Critical care beds available and needed by COVID-19 patients. We use estimations from the Institute for Health Metrics and Evaluation (IHME, University of Washington).
These estimations are for COVID-19 patients exclusively, and thus non-COVID patient needs were excluded. Also, the estimation of critical care beds available for COVID-19 patients is obtained by excluding the typical percentage of beds occupied by other patients and emergencies. With these estimations we calculate the ratio between the total number of critical care beds needed and the total number of critical care beds available (in the country).
Country's development data. We use data from the World Bank, which provides a comprehensive dataset on the development of countries around the globe. Table 2 shows a partial list of indicators about health system, climate, socio-economics and demographics for all countries involved in this study. Both temperature and relative humidity are given as the average of the respective values measured in the ten most affected cities in each country, and their values are taken three weeks prior to the current date in order to better correlate with the number of deaths by taking into account an approximation of the incubation and treatment periods. Table 3 shows data about age structure, sex-ratio and share of deaths by age. It is worth mentioning that men aged 70 or more correspond to the sub-population at highest risk. Finally, Table 4 shows a partial list of the risk factors and comorbidities. Our intuition is to correlate COVID-19 death toll motion with an approximation of the size of at-risk group within each country. Prevalence of smoking is based on the population aged 15 years and older who smoke any form of tobacco, including cigarettes, cigars, pipes or any other smoked tobacco products. Obesity is defined as having a body-mass index (BMI) equal to or greater than 30. Diabetes prevalence refers to the percentage of people aged between 20 and 79 years who have type 1 or type 2 diabetes. AIDS prevalence is based on the population aged between Table 2. Data on health system, climate, socioeconomics and demographics. For each item it is shown, at the peak of velocity on the country, the median followed by first and third quartiles (calculated over the 211 countries considered in the study).

Models
Our models are particularly useful in situations where many intuitive factors interact in nonobvious ways, which seems to be the case with COVID-19. Specifically, our models combine a multitude of factors in order to reflect the interdisciplinary aspects of the pandemic evolution. Model explainability [6] enables inspecting the high-level concepts and reasoning used by our models so that we may formulate possible narratives about the COVID-19 death toll motion defined in terms of velocity and acceleration. Model construction and evaluation. Our models are built using an implementation of the LightGBM algorithm [7]. The LightGBM algorithm produces a complex model composed of hundreds of simple decision trees [8] that are combined into a single model by a process known as boosting [9]. This added complexity allows our models to reach low error levels that are not likely to be obtained with simpler models. Another reason for choosing LightGBM is that it naturally deals with missing values, as some factors are not available for some countries. Basically, LightGBM ignores missing values during a split, and then allocate them to whichever side reduces the loss the most. Models are optimized either to predict the death toll velocity or the death toll acceleration to be observed in the next day. A model for each country at each outbreak day (we consider the first outbreak day as the day of the first death) is built and evaluated using discovery and validation datasets, as follows: Table 4. Data on risk factors and comorbidities. For each item it is shown the median followed by first and third quartiles (calculated over the 211 countries considered in the study).

Risk factors and comorbidities
Share of women who smoke (% of women) 12 ( • The discovery dataset for country c at day x includes data from all countries. More specifically, data is composed of all outbreak days until day x in all countries, except for country c for which data goes until day x−1 of the outbreak. Thus, the discovery datasets are almost the same for all countries, varying only the x th outbreak day which is dropped or included depending on the country (i.e., it is dropped for country c, and included for the other countries). Also, the discovery datasets keeping increasing with the outbreak. For the sake of reproducibility: our models were obtained by combining up to 100 simple decision trees with maximum tree-depth of 10, and the learning rate was set to 0.10.
• The validation dataset for country c at day x includes only the x th day of outbreak in country c. Thus, models will perform predictions for one-day ahead. The prediciton error associated with country c with respect to the validation datasets is assessed in terms of the standard error measures [10].
In summary, the model for country c at day x is obtained using data up to outbreak day x−1 from country c in addition to data from all other countries up to day x. The error figures to be reported correspond to the average of the validation loss, and no additional hold-out set was used. Each instance in both datasets corresponds to a specific day of the outbreak in a specific country, and it is represented by a superset of the factors we present in Tables 1-4. Some factors vary daily (i.e., number of critical care beds needed and available), while others remain constant during the entire outbreak period (i.e., population density). A complete list of the input factors is shown in S1 Appendix, and it includes additional factors not mentioned in Tables 1-4. These additional factors reflect the values of the previous day, so that derivative information is provided for factors that vary daily.
Algorithm1 describes the main steps of model construction for a specific country c and a specific outbreak day x. The model is built using the discovery datasets. A discovery dataset contains all factors for all countries for multiple outbreak days up to outbreak day x. Each discovery dataset has a paired validation dataset, which contains all factors of country c on a specific outbreak day x. All instances in the discovery and validation datasets also contains the velocity/acceleration information for the corresponding day. Discovery and validation datasets are used iterativelly in order to perform an exploratory search for the optimum subset of factors to compose the model for outbreak day x and country x. The exact search approach would require the exhaustive enumeration of all possible combinations of factors, so that one model is obtained for each combination in the power set. Obviously, inspecting all possible subsets of factors is computationally prohibitive. Instead, the algorithm samples the model space by randomly selecting the factors to compose a model. At the end, the best performing models are finally selected using the validation datasets and used to estimate velocity or acceleration oneday-ahead. Specifically, the selected model for outbreak day x and country c will be the one with the lowest cumulative error considering all validation datasets up to day x. Prediction error is given both in terms of the standard Mean Absolute Error and Mean Absolute Percentage Error (MAPE) measure.
Algorithm 1 samples a large number of candidate models (n = 1, 000, 000), and thus it is unlikely that (detrimental) correlated factors are included in the selected (best) model. Since correlated factors may make the estimates worse, it is expected that the best model does not include correlated factors. An important consideration about Algorithm 1 is that we assume that p is always the same across the different countries.
Algorithm 1 Model construction for country c at day . . .; f n a g with the lowest � i a Model explainability. We assume that a low-error model may be seem as a possible explanation for COVID-19 death toll velocity and acceleration in a country. However, our models are complex as they are obtained from the interplay of a large number of factors. While this provides a more realistic and unified view of COVID-19 death toll motion, understanding how each factor contributes to velocity and acceleration is thus a major challenge. We follow a prominent method that is based on Shapley values from cooperative game theory [11] in order to attribute importance to factors. Specifically, Shapley values can be used to find a fair division scheme that defines how the total importance should be distributed among the factors composing the model [6]. The method is based on the idea that: (i). a specific outbreak day x is represented by a set of factors that interact according to a model, thus resulting in the prediction f(x), and (ii). the magnitude of the difference between f(x) and the average prediction E[f(x)] (i.e., the average velocity, or the average acceleration in the discovery dataset) is divided among the factors in a unique way that is "fair" given their individual contributions to change the prediction from E[f(x)] to f(x). Thus, the average contribution each factor has on changing the model prediction is considered to be its importance score.
More formally, the method creates an explanation function g(f, x) which takes as input a model f and the values assumed by the different factors within an outbreak day x, and returns importance scores of these factors on day x. The importance score associated with a factor can be directly interpreted as the sensitivity that indicates how much the model's response will vary as the factor has its value increased or decreased. Importance scores will be visualized using two different plots: (i) Waterfall Plot − Shows importance scores for a specific country on a specific day of the outbreak. Thus, waterfall plots give a more specific, per-country explanation. Waterfall plots are designed to visually display how the importance of each factor moves the model output from a prior expectation under the discovery dataset distribution to the final model prediction given the evidence of all the factors within the model. Each waterfall plot shows the importance of a set of factors in a given day, so that: • Factors are sorted from top to bottom by the magnitude of their importance values, with factors of smallest magnitude grouped at the bottom to avoid clutter.
• The importance for each factor is shown as positive (in red) indicating that the factor contributes to increase velocity or acceleration, and negative (in blue) indicating that the factor contributes to decrease velocity or acceleration. Thus, in summary, the x−axis shows the estimated value (which can be either velocity or acceleration), while the y−axis shows the factors contributing mostly for the final estimation. The number appearing on the left of a factor indicates its corresponding value on a particular day of the outbreak (i.e., the, a quantity which can be measured or the number of days a countermeasure is being adopted). A positive value, which is shown in red, indicates that the corresponding factor contributes to increase velocity or acceleration. Negative values, which are shown in red, indicate that the corresponding factor contributes to decrease velocity or acceleration.
(ii) Summary Plot − Shows importance scores considering all countries on a specific day of the outbreak. Thus, summary plots give a more general explanation. Summary plots have five characteristics: • Each row corresponds to a factor and has as many dots as countries.
• A dot represents the value of the corresponding factor for a country. Red dots indicate that the factor assumes a high value for the corresponding country. Likewise, blue dots indicate that the factor assumes a low value.
• The x−axis position of a dot indicates the impact the corresponding factor has on velocity or acceleration. The impact of a factor is defined as how much the factor is decreasing or increasing velocity or acceleration.
• The vertical line shows whether the impact of a factor increased the prediction (i.e., the dot is on the right size) or decreased it (i.e., the dot is on the left side).
• Factors impacting most the model prediction appear on the top of the plot.

Results
In this section, we present results for some countries (for a live instance of our analysis tool with a larger set of countries, please visit https://covid-19.kunumi.com). It is worth mentioning that we did not use autoregressive time-series prediction models (e.g., ARIMA [12]) in our experiments. While this algorithm is considered one of the most used prediction models for time series, its input prevents us to collect actionable explanations as the estimations provided depend mostly on time-series and lag information, and not on actionable factors. Table 5 shows the Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) for velocity and acceleration models of some countries considering the number of days after the first death. In the table, countries are shown in ascending order of error for velocity. In general, the errors of our models (one-day ahead prediction) are very low. It is hard to grasp a clear trend on how the number of factors within each model, k, impacts the error. The best models (i.e., the ones obtained with the execution of Algorithm 1) have different values of k. The value of k seems to depend mainly on the country, but also on the outbreak day. For instance, models estimating acceleration in Brazil have values of k varying from 28 to 34 factors, depending on the outbreak day. Models estimating acceleration in Germany have values of k varying from 26 to 31 factors. For Italy, k varies from 19 to 23 factors. An interesting pattern is that models for estimating acceleration are usually composed of fewer factors than models for estimating velocity. Also, models for a country are stable in the sense that the factors within them do not change drastically on consecutive outbreak days. On the contrary, by inspecting how the factors vary on a daily-basis for a same country, we concluded that they change smoothly for consecutive days. Table 6 shows the average Jaccard Index (which is a statistic used for gauging the similarity and diversity of two sample sets) of factors within models of two consecutive days. The table shows that models are extremely similar on consecutive outbreak days, but stability decreases on the specific days just before and after the peak of velocity and acceleration. Models that estimate velocity seem to be more stable that models estimating Table 5. Model effectiveness at June 5th 2020, for some countries. The prediction error is given both in terms of the mean absolute error (MAE) and mean absolute percent error (MAPE). Error is calculated on a daily-basis and then it is averaged. Standard deviation is also shown. velocity, and this may be because models estimating acceleration employ less factors that models estimating velocity. For each country we employ model explainability so that we can discuss the main factors driving the predictions for some of these countries. The days between the peak of acceleration and the peak of velocity are of particular importance for evaluating the impact of countermeasures being adopted by public health officials. The days during this period reveal two important aspects: acceleration decreases while velocity keeps increasing, which makes the impact of countermeasures on slowing death toll pace hard to evaluate. However, we are able to present the main factors driving the decrease of acceleration during this period, thus revealing which countermeasures are being effective and also the ones that need further attention.

Model
In order to keep waterfall plots with a manageable size we restricted the number of factors that are explicitly shown to only the most important ones. The remaining factors are collapsed into a group of other explanatory factors, and thus they cannot be directly interpreted. For instance, the factor "Critical care beds needed/available" usually shows an important aspect regarding the health system situation in the current outbreak day. It is important to mention that a similar factor, namely "Critical care beds needed/available on previous day" may also compose the model, and in this case it helps by giving a trend on whether the health system situation is worsening or improving. Thus, please notice that a higher value of "Critical care beds needed/available" can contribute to decrease acceleration if the health system situation is getting better. Finally, it is also worth mentioning that explanations change smoothly on a dailybasis. However, since we are focusing our analysis on the peaks of acceleration and velocity, the corresponding plots are those obtained when acceleration changes the most.   are still ineffective as they are not yet contributing to decrease acceleration. Five days later, other measures started contributing to decrease acceleration. Finally, at the peak of velocity, most measures are contributing to decrease acceleration, the most effective seem to be self-isolation, closing public transportation, and the cancellation of public events. By contrast, the long time without taking any measure regarding closing schools/universities is now a factor contributing to increase acceleration. Further, some measures may take longer to become effective (i.e., limitation of private gatherings).

Italy, Netherlands and UK-Evaluating countermeasures (and the lack of them)
Figs 12 and 13 show observed and predicted curves for Netherlands. The peak of acceleration occurred on the 17 th day after the first death, while the peak of velocity occurred on the 28 th day. Figs 14 and 15 show the main factors explaining acceleration on the 20 th and 28 th days. Few days after the peak of acceleration (i.e., 20 th day), measures on restricting internal movement started to decrease acceleration. On the peak of velocity, several factors are contributing to decrease acceleration, but the main ones include cancelling public events and banning arrivals from affected regions. It is worth mentioning that restricting internal movement is still The peak of acceleration occurred on the 23 rd day after the first death, while the peak of velocity occurred on the 38 th day. Figs 18 and 19 show the main factors explaining acceleration on the 30 th and 38 th days. Few days after the peak of acceleration, measures such as closing schools/universities and selfisolation, started to decrease acceleration. On the peak of velocity, other measures started to become effective. Cancelling public events seems to be an important measure, which still needs additional days to finally become effective. Finally, on the 30 th day, the recommendation of closing public transportation was adopted for 17 days, and it is still contributing for increasing acceleration. However, on the 38 th day, the measure was adopted for 24 days, and started to contribute to decrease acceleration.
It is worth mentioning that Italy, Netherlands and UK have adopted countermeasures at different stages of the outbreak, and as a result the corresponding benefits took different times to appear. As expected, countermeasures adopted at earlier stages of the outbreak showed benefits earlier, as they started contributing to decrease acceleration earlier. For instance, both Italy and Netherlands required cancelling public events earlier than UK. Specifically, Italy adopted this countermeasure 5 days after the first death, and Netherlands adopted this countermeasure 4 days after the first death. UK took 8 days after the first death to start requiring the cancellation of public events. Another example concerns recommending closing public transport. Italy adopted this countermeasure 12 days before the first death, and UK adopted the same countermeasure 13 days before the first death. On the contrary, Netherlands adopted this measure 17 after the first death. For Italy and UK benefits appeared on the peak of velocity, but for Netherlands this factor is still contributing to increase acceleration at the peak of velocity.  recency of countermeasures. For instance, no public transportation measures were taken for 67 days, and this is the main factor increasing acceleration at this stage of the outbreak. A related factor is the small decrease of visitors in transit stations (-10%), which also increases acceleration.

Germany-The importance of a robust health system
Since the 20 th day marks the peak of acceleration, it is not expected to find many factors contributing to decrease acceleration. In fact, there are only three factors worth mentioning: the small proportion of urban population living in slums, the restriction on private gatherings, and most importantly, the high availability of critical care beds, indicating a robust system even during the peak of acceleration. As expected, many factors are contributing to increase velocity on the peak of acceleration. These factors are shown in Fig 22, including the distribution of urban population, male/female proportion on different ages, and risk factors such as obesity and smoking. On the other side, the high availability of critical care beds is the main factor contributing to decrease velocity. This points, again, to the importance of a prepared and robust health system in order to reduce death toll.
The 39 th day marks the peak of velocity, and factors contributing to increase velocity include typical comorbidities (i.e., stroke) and risk factors such as obesity, as shown in Fig 24. The only relevant factor contributing to decrease velocity is the high availability of critical care beds. Ensuring sufficient physical infrastructure and resources, even on the peak of velocity, is crucial for dealing with the outbreak . Fig 25 shows that on the 39 th day some factors are contributing to decrease acceleration. Stringent measures (such as the cancellation of public

PLOS ONE
COVID-19 data-driven narratives events) were adopted for a time which is now sufficient to show benefits. Further, the high availability of critical care beds keeps contributing to decrease acceleration.

Brazil-Urban slums are potential incubators of COVID-19
Figs 26 and 27 show observed and predicted curves for Brazil. A particular period of the outbreak, starting on the 31 st day and continuing until the 45 th day, was marked by a consistent increase in acceleration. The drastic increase in acceleration caused the health system to collapse, as critical care beds needed became greater than beds available. Next we discuss the main factors driving acceleration during this period.
Figs 28-30 show the main factors driving acceleration in Brazil from the 31 st to the 45 th day after the first death. Clearly, the high proportion of urban population living in slums was a major factor increasing acceleration on this period. Urban slums are characterized by lowquality housing, limited basic services and poor sanitation. This makes simple lifesaving practices like hand-washing a challenge. Further, households in urban slums are typically + 10 times denser than neighboring areas of the same city. The largest urban slum in Brazil, for instance, has a density of 39,000 inhabitants per km 2 , while the country's population density is only 25 inhabitants per km 2 . Finally, many people in slums need to work normally during the outbreak, as they often have no savings. They use public transportation over long distances,

Belgium-The risk with long-term care and nursing institutions
A high proportion of long-term care facilities have reported COVID-19 outbreaks, with high rates of morbidity and case fatality in residents. Deaths in Belgium's long-term care institutions account for one third of the COVID-19 death toll in the country, while in Switzerland, residents of long-term care institutions account for more than half of deaths due to COVID-19 [13]. Despite measures promoting social distancing, long-term care residents need care from staff who inevitably might transmit the virus. Infected asymptomatic staff members ravaged the long-term care institutions whose residents are older adults often with underlying chronic medical conditions [14]. Figs 31 and 32 show observed and predicted curves for Belgium. The curves point to two days of particular interest: the 21 st day after the first death, which marks the peak of acceleration, and the 30 th day, which marks the peak of velocity. The 30 th day is further inspected in Fig 33, which shows the main factors driving velocity on the peak of velocity. Indeed, the high number of long-term care beds in Belgium is among the factors contributing to increase velocity on both peaks.

Brazil and USA-The persistent impact of income inequality
While some factors gain importance only during specific periods of the outbreak, income inequality shows high impact on velocity during the entire outbreak. Income inequality is assessed with the GINI Index [15], a standard measure which varies from 0 (total equality) to 100 (total inequality). Inequality has many undesirable effects including significant negative impacts on public health [16], and our models reveal that countries with greater income inequality are more likely to face an increased COVID-19 death toll pace. This is the case of the two most affected countries, Brazil and the USA, two countries that suffer greatly from high income inequality.

Global overview on the importance of factors and measures
So far we have discussed how different factors impact velocity and acceleration on specific country-level models. To get a global overview on the impact that different factors have on velocity, we plotted the contribution of every factor for every country. We selected three https://doi.org/10.1371/journal.pone.0264893.g032 particular days to analyze: the peak of acceleration, the peak of velocity, and June 5th 2020 as it is the last day in our data. Fig 41 shows how the different factors are explaining velocity at the peak of acceleration on different countries. Next we discuss the role of some of these factors.
As expected, the most important factor is the demand for critical care beds. Clearly, countries showing high demands at the peak of acceleration (i.e., red dots) have experienced a large increase in velocity due to this factor. Little demands, on the other hand, usually contribute to decrease velocity. Other generally important factors include income distribution, number of beds in long-term care institutions, smoking prevalence among females, share of population infected with HIV, and relative number of cancer-related deaths among the elderly population. High values of these factors contribute to increase velocity.

COVID-19 data-driven narratives
It is worth mentioning that while we are evaluating each factor in isolation, their contributions to velocity are calculated by taking into account the complex interplay among all them. Take for instance the role of relative humidity − low relative humidity contributes to decrease velocity, while high relative humidity usually contributes to increase velocity. There are some exceptions, however, where high relative humidity contributes to decrease velocity. In order to better understand these exceptions, we extracted a heatmap from our working data showing the relationship between temperature and relative humidity, and check how their interplay ultimately impacts velocity. Fig 42 presents a heatmap showing the relationship between temperature and relative humidity. Clearly, temperatures higher than �25˚C are strongly associated with low velocities, despite the high values of relative humidity. Interestingly, authors in [17] have reported that the relationship between the annual average of temperature compensation and COVID-19 confirmed cases was approximately linear in the range of less than 25.8˚C, which became flat above 25.8˚C.
The lack of countermeasures to slow the spread of the virus is reflected in several factors occupying the middle of the heatmap. As expected, the more days without a measure the higher is the impact on increasing velocity. A particular important measure at this stage of the outbreak seems to be the adoption of coordinated public information campaign due to the need for essential information and advice right on early stages of the outbreak. Clearly, the more days with coordinated public campaigns the more velocity decreases. However, at this stage of the outbreak, countermeasures are among the factors impacting the least on velocity.
Other factors that are worth mentioning include the number of years without BCG vaccination and share of one-person households in the country. Both factors show marginal impact on velocity at the peak of acceleration and are among the ones showing small impact on velocity, but they gain importance on latter stages of the outbreak, as shown in the following. for critical care beds, number of long-term care beds in institutions, and income distribution. Factors such as number of years without BCG vaccination gained substantial importance. Specifically, some countries seem to benefit from an active national vaccination program. As recently reported in [18], the BCG vaccine has beneficial non-specific effects on the immune system that protect against a wide range of infections, and randomized controlled trials have provided evidence that the BCG vaccine's immunomodulatory properties can protect against respiratory infections.
Two factors that also gained substantial importance are the availability of nurses and the percentage of GDP expended on the health system. Specifically, velocity decreases as more nurses are available. Similarly, velocity decreases as more is spent on the health system. Finally, the more days adopting less stringent measures (i.e., closing workplaces only for some categories, narrow debt relief and banning arrivals only from some regions), the more velocity increases. The opposite trend is observed with more stringent measures (i.e., broad contract relief), that is, the more days adopting these measures, the more velocity decreases. Fig 44 shows how the different factors have impacted velocity on June 5th 2020, the last day in our working data. Interestingly, income distribution became the top factor impacting velocity. The number of nurses available, the age working dependency ratio and the share of elderly people also gained importance, and are now among the factors impacting velocity most.
Two factors are involved in an interesting contrast: the share of one-person households and the population living in slums. Both factors have similar importance at this stage of the outbreak, but they show opposite trends. One person households contribute to decrease velocity, while population living in slums contribute to increase velocity. People living together are clearly more exposed to getting infected because they share space. Larger households, which are a characteristic of urban slums, pose greater risk of infection during the outbreak because they can contribute to the rapid spread of the virus.
Finally, at this stage of the outbreak, it becomes clearer which countermeasures were effective. We consider as effective a measure which contributes to decrease velocity as it is adopted for more days. Thus, effective measures include the recommendation of not leaving house, the adoption of a broad debt relief, closing schools, limiting private gatherings to less than 10 people, and cancelling public events.

Conclusion
This study shows how concepts in model explainability can be used to facilitate reasoning about different hypotheses regarding the ongoing COVID-19 pandemic. For this, we built intuitive country-level COVID-19 motion models described in terms of death toll velocity and acceleration, which are defined as the first and second order derivatives, respectively, of the COVID-19 death toll in respect to time. The study considered a multitude of factors, and we show that both velocity and acceleration can be approximated with proper combinations of these factors. Model explanation techniques reveal the main factors driving velocity and acceleration. In general, more intricate factors such as urbanization problems, typical comorbidities, and social inequality issues are among the ones better explaining velocity. Acceleration, on the other hand, is explained by the effects of public health measures being adopted, or the lack of them. These explanations enable data-driven narratives that may help epidemiologists and policymakers to understand factors driving the outbreak, as well as to monitor the impact of public health measures on the spread of the disease. As future work we intend to implement counterfactual predictions in order to estimate what is likely to happen as a result of a countermeasure. These predictions would allow the assessment of the impact of countermeasures by means of "what-if" simulation scenarios, possibly benefiting decision making.

PLOS ONE
COVID-19 data-driven narratives Supporting information S1 Appendix. Tables 1-4 present the main factors found by Algorithm 1. However, the complete list of considered factors is much larger. Tables 7-12 show the factors not discussed previously. The names we gave to these factors are self-explanatory. (PDF) S1 Text. (TXT)